
Fix OOB reads in SoftmaxCrossEntropyLoss via label bounds validation #28004

Open
vraspar wants to merge 1 commit into main from vraspar/fix-softmax-ce-loss-oob-read

Conversation


vraspar (Contributor) commented Apr 7, 2026

Description

Fix out-of-bounds reads in SoftmaxCrossEntropyLoss and SoftmaxCrossEntropyLossGrad when label values are outside [0, C).

This is the same class of bug that #27568 fixed for SparseSoftmaxCrossEntropy; the CUDA kernels already have this check.

Changes

  • Validate label values are in [0, C) before indexing (forward + backward), skipping ignore_index
  • Validate weight_shape[0] == C
  • Move weight_data[label] access after ignore_index check in grad weighted paths

Tests

Four regression tests: label too large, negative label, OOB with weights, and grad OOB.
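One of those regression tests might look roughly like the OpTester-based sketch below. The tensor names, opset version, attribute, and expected error substring are illustrative assumptions, not the exact tests added by this PR.

#include "gtest/gtest.h"
#include "test/providers/provider_test_utils.h"

namespace onnxruntime {
namespace test {

// Sketch only: forward-path failure test where the second label (7) is >= C (3).
// The run is expected to fail validation instead of reading out of bounds.
TEST(CrossEntropyTest, SoftmaxCrossEntropyLoss_LabelOutOfRange_Sketch) {
  OpTester test("SoftmaxCrossEntropyLoss", 13);
  test.AddAttribute("reduction", "mean");
  test.AddInput<float>("scores", {2, 3}, {0.1f, 0.2f, 0.7f, 0.3f, 0.3f, 0.4f});  // N=2, C=3
  test.AddInput<int64_t>("labels", {2}, {1, 7});                                 // 7 is out of range for C=3
  test.AddOutput<float>("output", {}, {0.0f});  // value irrelevant; the run should fail before comparison
  test.Run(OpTester::ExpectResult::kExpectFailure, "Label value");               // assumed error substring
}

}  // namespace test
}  // namespace onnxruntime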


Copilot AI left a comment


Pull request overview

This PR hardens runtime validation to prevent out-of-bounds reads by ensuring indices and sizes are validated before they are used for tensor indexing in the training loss ops, and also adds validation for MatMulBnb4 inputs.

Changes:

  • Add label-value bounds validation for SoftmaxCrossEntropyLoss and SoftmaxCrossEntropyLossGrad (skipping ignore_index) and validate weight_shape[0] == C on CPU.
  • Add regression tests that expect failures for out-of-range labels for SoftmaxCrossEntropyLoss (forward + grad).
  • Add K/N/block_size positivity checks and minimum-size validation for MatMulBnb4 B and absmax inputs (CPU + CUDA), plus new negative tests; a rough sketch of such checks follows below.
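To make the MatMulBnb4 bullet concrete, here is a sketch of checks of that kind. The size formulas (two 4-bit values per byte, one absmax entry per block) are assumptions about the bnb4 layout, and the helper below is hypothetical, not taken from the PR's diff.

#include <cstdint>

#include "core/common/common.h"
#include "core/framework/tensor.h"

namespace onnxruntime {

// Hypothetical helper: reject non-positive attributes and undersized inputs
// before any element of B or absmax is dereferenced.
Status CheckMatMulBnb4Inputs(int64_t K, int64_t N, int64_t block_size,
                             const Tensor& b_quant, const Tensor& absmax) {
  ORT_RETURN_IF_NOT(K > 0 && N > 0 && block_size > 0,
                    "K, N and block_size attributes must all be positive.");
  const int64_t quant_count = K * N;                                   // number of 4-bit values
  const int64_t min_b_bytes = (quant_count + 1) / 2;                   // packed two values per byte
  const int64_t min_blocks = (quant_count + block_size - 1) / block_size;
  ORT_RETURN_IF_NOT(b_quant.Shape().Size() >= min_b_bytes,
                    "Quantized B input has fewer bytes than K*N 4-bit values require.");
  ORT_RETURN_IF_NOT(absmax.Shape().Size() >= min_blocks,
                    "absmax input has fewer elements than the number of quantization blocks.");
  return Status::OK();
}

}  // namespace onnxruntime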

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 10 comments.

Summary per file:

  • orttraining/orttraining/training_ops/cpu/loss/softmax_cross_entropy_loss.cc: Adds CPU-side bounds validation for labels and weight shape in SoftmaxCrossEntropyLoss forward/grad paths.
  • orttraining/orttraining/test/training_ops/cuda/cross_entropy_test.cc: Adds new failure-mode tests for out-of-range labels (currently duplicated with the new CPU test file).
  • orttraining/orttraining/test/training_ops/cpu/loss/cross_entropy_test.cc: Adds new CPU regression tests for out-of-range labels (currently duplicates the CUDA test names).
  • onnxruntime/contrib_ops/cpu/quantization/matmul_bnb4.cc: Adds attribute positivity checks and input-size validation for MatMulBnb4 on CPU.
  • onnxruntime/contrib_ops/cuda/quantization/matmul_bnb4.cc: Adds attribute positivity checks and input-size validation for MatMulBnb4 on CUDA.
  • onnxruntime/test/contrib_ops/matmul_bnb4_test.cc: Adds new negative tests for undersized MatMulBnb4 inputs.



Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.



vraspar requested a review from adrianlizarraga April 13, 2026 22:13
Add bounds checking for label tensor values in SoftmaxCrossEntropyLoss
and SoftmaxCrossEntropyLossGrad to prevent out-of-bounds memory reads
when processing untrusted ONNX models.

The operators used label_data values as array indices into log_prob_data
and weight_data buffers without validating they fall within [0, C) where
C is the number of classes. A malicious model could embed arbitrary label
values causing heap OOB reads.

Changes:
- Add label value validation in both forward and backward Compute methods
- Add weight tensor size validation (weight_shape[0] == C)
- Move weight_data[label] access after ignore_index check in grad path
  to prevent OOB when ignore_index is outside [0, C)
- Add regression tests for both forward and backward paths

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
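The third bullet of the commit message above (moving the weight_data[label] access after the ignore_index check) can be illustrated with a small standalone sketch. The function and variable names are hypothetical; only the ordering is the point, not the PR's actual grad kernel.

#include <cstdint>

// Standalone sketch: the ignore_index comparison happens before
// weight_data[label] is read, so an ignore_index outside [0, C) never becomes
// an array index. Labels are assumed to have been validated against [0, C)
// in an upfront pass.
void ComputeSampleWeights(float* sample_weight, const int64_t* label_data,
                          const float* weight_data, int64_t N_D, int64_t ignore_index) {
  for (int64_t i = 0; i < N_D; i++) {
    const int64_t label = label_data[i];
    if (label == ignore_index) {
      sample_weight[i] = 0.0f;  // ignored sample: zero contribution, no weight lookup
      continue;
    }
    sample_weight[i] = weight_data[label];  // safe: label already known to be in [0, C)
  }
}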
vraspar force-pushed the vraspar/fix-softmax-ce-loss-oob-read branch from f2c15e3 to f20400c on April 16, 2026 18:38
vraspar requested a review from Copilot April 16, 2026 18:38

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.



// Validate label values are within [0, C) to prevent out-of-bounds reads.
for (int64_t i = 0; i < N_D; i++) {
  if (ignore_index != label_data[i]) {
    ORT_RETURN_IF(label_data[i] < 0 || label_data[i] >= C,
Member


Should we move that check into one of the loops below, where the label is compared to ignore_index a second time?

Contributor Author


Should we move that check into one of the loops below, where the label is compared to ignore_index a second time?

I think the separate validation pass is the better approach here, for a few reasons:

  1. Fail-fast: validates all labels before any computation begins, so we never produce partial/corrupt output on bad input.
  2. Parallel safety: the Grad path uses TryParallelFor, and propagating errors out of parallel lambdas is tricky. A separate upfront check avoids that complexity (a rough sketch of what the in-loop alternative would look like follows below).
  3. Negligible cost: the label scan is O(N_D) vs O(N_D × C) for the main computation, so the extra pass should not be measurable in practice.

That said, I'm open to folding it in if you feel strongly about it. Just wanted to flag the tradeoffs.
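To illustrate point 2, here is a rough sketch of what folding the check into the parallel grad loop would require, assuming the ThreadPool::TryParallelFor overload that takes a per-unit cost and a range lambda. This is the complexity the upfront pass avoids, not code from this PR; the function and cost value are hypothetical.

#include <atomic>
#include <cstdint>

#include "core/common/common.h"
#include "core/platform/threadpool.h"

namespace onnxruntime {

// Hypothetical alternative, for illustration only: reporting an out-of-range
// label from inside the parallel loop has to go through shared state because
// the range lambda returns void. Other shards may already have written
// partial gradient output by the time the flag is observed.
Status ValidateLabelsInsideGradLoop(concurrency::ThreadPool* tp, const int64_t* label_data,
                                    int64_t N_D, int64_t C, int64_t ignore_index) {
  std::atomic<bool> out_of_range{false};
  concurrency::ThreadPool::TryParallelFor(
      tp, static_cast<std::ptrdiff_t>(N_D), 1.0 /* assumed per-element cost */,
      [&](std::ptrdiff_t begin, std::ptrdiff_t end) {
        for (std::ptrdiff_t i = begin; i < end; ++i) {
          const int64_t label = label_data[i];
          if (label != ignore_index && (label < 0 || label >= C)) {
            out_of_range.store(true, std::memory_order_relaxed);
            return;  // stops this shard only; other shards keep running
          }
          // ... per-sample gradient work would go here ...
        }
      });
  ORT_RETURN_IF(out_of_range.load(), "Label value out of range [0, ", C, ").");
  return Status::OK();
}

}  // namespace onnxruntime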

vraspar requested a review from xadupre April 30, 2026 20:32
